We use Python for this:
Use what your colleagues (tend to) use
To analyse and visualise experimental data
Tabular (comma-separated) data
We can do this with a little programming
Before we begin…
cd ~/Desktop
mkdir python-novice-inflammation
cd python-novice-inflammation
LIVE DEMO
Before we begin…
cp 2017-03-23-standrews/lessons/python-01/data/python-novice-inflammation-data.zip ./
cp 2017-03-23-standrews/lessons/python-01/data/python-novice-inflammation-code.zip ./
unzip python-novice-inflammation-data.zip
unzip python-novice-inflammation-code.zip
(You can download the files via the Etherpad: http://pad.software-carpentry.org/2017-03-23-standrews)
LIVE DEMO
Jupyter
At the command line, start Jupyter notebook:
jupyter notebook
Jupyter landing page
variables)
Jupyter documents are composed of cells
A Jupyter cell can have one of several types
Markdown
Markdown allows us to enter formatted text.
Shift + Enter
name, containing "Samia"
print() function shows the contents of a variable
weight_kg = 55
print(weight_kg)
2.2 * weight_kg
print("weight in pounds", 2.2 * weight_kg)
weight_kg = 57.5
print("weight in kilograms is now:", weight_kg)
weight_lb = 2.2 * weight_kg
print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)
weight_kg = 100
print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)
What are the values in mass and age after the following code is executed?
mass = 47.5
age = 122
mass = mass * 2.0
age = age - 20
mass == 47.5, age == 122
mass == 95.0, age == 102
mass == 47.5, age == 102
mass == 95.0, age == 122
What does the following code print out?
first, second = 'Grace', 'Hopper'
third, fourth = second, first
print(third, fourth)
Hopper Grace
Grace Hopper
"Grace Hopper"
"Hopper Grace"
Jupyter notebook or IPython terminal…
%whos will show you all defined variables
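A minimal sketch of why weight_lb in the weight demo keeps its old value after weight_kg is changed: assignment stores the computed value, not a link to the expression that produced it. The name stale_lb and the final recomputation step are additions for illustration only.

```python
# Assignment stores a value; it does not link variables together.
weight_kg = 57.5
weight_lb = 2.2 * weight_kg   # computed once, from the current weight_kg

weight_kg = 100               # rebinding weight_kg leaves weight_lb untouched
stale_lb = weight_lb          # still based on 57.5 kg (hypothetical helper name)

weight_lb = 2.2 * weight_kg   # recompute explicitly to pick up the new value
print('stale:', stale_lb, 'recomputed:', weight_lb)
```

If a derived value must follow its input, it has to be recomputed after every change to that input.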
data/inflammation-01.csv
$ head data/inflammation-01.csv
0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0
0,1,2,1,2,1,3,2,2,6,10,11,5,9,4,4,7,16,8,6,18,4,12,5,12,7,11,5,11,3,3,5,4,4,5,5,1,1,0,1
0,1,1,3,3,2,6,2,5,9,5,7,4,5,4,15,5,11,9,10,19,14,12,17,7,12,11,7,4,2,10,5,4,2,2,3,2,2,1,1
0,0,2,0,4,2,2,1,6,7,10,7,9,13,8,8,15,10,10,7,17,4,4,7,6,15,6,4,9,11,3,5,6,3,3,4,2,3,2,1
0,1,1,3,3,1,3,5,2,4,4,7,6,5,3,10,8,10,6,17,9,14,9,7,13,9,12,6,7,7,9,6,3,2,2,4,2,0,1,1
0,0,1,2,2,4,2,1,6,4,7,6,6,9,9,15,4,16,18,12,12,5,18,9,5,3,10,3,12,7,8,4,7,3,5,4,4,3,2,1
0,0,2,2,4,2,2,5,5,8,6,5,11,9,4,13,5,12,10,6,9,17,15,8,9,3,13,7,8,2,8,8,4,2,3,5,4,1,1,1
0,0,1,2,3,1,2,3,5,3,7,8,8,5,10,9,15,11,18,19,20,8,5,13,15,10,6,10,6,7,4,9,3,5,2,5,3,2,2,1
0,0,0,3,1,5,6,5,5,8,2,4,11,12,10,11,9,10,17,11,6,16,12,6,8,14,6,13,10,11,4,6,4,7,6,3,2,1,0,0
0,1,1,2,1,3,5,3,5,8,6,8,12,5,13,6,13,8,16,8,18,15,16,14,12,7,3,8,9,11,2,5,4,5,1,4,1,2,0,0
numpy library
Python libraries
Python contains many powerful, general tools
import
import numpy
import seaborn
JUPYTER MAGIC
Jupyter is through magic
%pylab inline
import numpy
import seaborn
Jupyter notebooks
numpy, seaborn, pylab
numpy: work with matrices and arrays in Python
seaborn: attractive statistical summary graphs
pylab: numerical operations and visualisation in Python
Calling %pylab inline shows graphics within the notebook itself
numpy provides a function loadtxt() to load tabular data:
numpy.loadtxt(fname='data/inflammation-01.csv', delimiter=',')
loadtxt() belongs to numpy
fname: an argument expecting the path to a file
delimiter: an argument expecting the character that separates columns
... indicate missing rows or columns
1 == 1. == 1.0)
data
type(data)
print(data.dtype)
print(data.shape)
LIVE DEMO
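The loading steps can be tried without the lesson files. The sketch below is only an illustration: the three-row csv_text string is a made-up stand-in for data/inflammation-01.csv, and it relies on loadtxt() accepting a file-like object as well as a path. It also stores the result in a variable named data, as the later slides assume.

```python
import io
import numpy

# Hypothetical stand-in for a CSV file: three patients, four days.
csv_text = "0,0,1,3\n0,1,2,1\n0,1,1,3\n"

# loadtxt() accepts a file path or a file-like object such as StringIO.
data = numpy.loadtxt(io.StringIO(csv_text), delimiter=',')

print(type(data))    # the array type provided by numpy
print(data.dtype)    # type of the elements (loadtxt defaults to float)
print(data.shape)    # (number of rows, number of columns)
```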
data
data.<attribute> e.g. data.shape
print('first value in data:', data[0, 0])
print('middle value in data:', data[30, 20])
LIVE DEMO
: (colon).
print(data[0:4, 0:10])
print(data[5:10, 0:10])
LIVE DEMO
Python assumes the first element
Python assumes the end element
QUESTION: What would : on its own indicate?
small = data[:3, 36:]
print('small is:')
print(small)
LIVE DEMO
We can take slices of any series, not just arrays.
element = 'oxygen'
print('first three characters:', element[0:3])
first three characters: oxy
What is the value of element[:4]?
oxyg
gen
oxy
en
arrays know how to perform operations on their values
+, -, *, /, etc. are elementwise
doubledata = data * 2.0
print('original:')
print(data[:3, 36:])
print('doubledata:')
print(doubledata[:3, 36:])
tripledata = doubledata + data
print('tripledata:')
print(tripledata[:3, 36:])
LIVE DEMO
numpy functions
numpy provides functions to operate on arrays
print(numpy.mean(data))
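A small sketch of elementwise arithmetic, assuming a made-up 2 x 3 array in place of the inflammation data, so that every value can be checked by eye.

```python
import numpy

# Hypothetical 2x3 array standing in for the inflammation data.
data = numpy.array([[0.0, 1.0, 2.0],
                    [3.0, 4.0, 5.0]])

doubledata = data * 2.0          # every element is multiplied by 2
tripledata = doubledata + data   # elements are added position by position

print('original:')
print(data)
print('tripledata:')
print(tripledata)
```

Both operations return new arrays with the same shape as data; data itself is unchanged.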
maxval, minval, stdval = numpy.max(data), numpy.min(data), numpy.std(data)
print('maximum inflammation:', maxval)
print('minimum inflammation:', minval)
print('standard deviation:', stdval)
maxval, minval, stdval = data.max(), data.min(), data.std()
print('maximum inflammation:', maxval)
print('minimum inflammation:', minval)
print('standard deviation:', stdval)
LIVE DEMO
patient_0 = data[0, :] # Row zero only, all columns
print('maximum inflammation for patient 0:', patient_0.max())
print('maximum inflammation for patient 0:', numpy.max(data[0, :]))
print('maximum inflammation for patient 2:', numpy.max(data[2, :]))
LIVE DEMO
numpy operations on axes
numpy functions take an axis= parameter: 0 (columns) or 1 (rows)
print(numpy.max(data, axis=1))
print(data.mean(axis=0))
LIVE DEMO
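To see what the axis= parameter does, it can help to use an array small enough to check by hand. The sketch below assumes a made-up array with 2 patients (rows) and 3 days (columns); the variable names are illustrative.

```python
import numpy

# Hypothetical data: 2 patients (rows) x 3 days (columns).
data = numpy.array([[1.0, 5.0, 3.0],
                    [4.0, 2.0, 6.0]])

max_per_patient = numpy.max(data, axis=1)   # one value per row
mean_per_day = data.mean(axis=0)            # one value per column

print(max_per_patient.shape, max_per_patient)
print(mean_per_day.shape, mean_per_day)
```

The length of each result shows which dimension survived: axis=1 leaves one entry per row, axis=0 leaves one entry per column.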
Here’s one I prepared earlier (for the Software Sustainability Institute):
matplotlib
matplotlib is the de facto standard plotting library in Python
We imported seaborn earlier, which makes matplotlib output nicer
We used %pylab inline earlier, which puts matplotlib output in the notebook
import matplotlib.pyplot
image = matplotlib.pyplot.imshow(data)
LIVE DEMO
matplotlib .imshow()
.imshow() renders matrix values as an image
matplotlib .plot()
.plot() renders a line graph
ave_inflammation = numpy.mean(data, axis=0)
ave_plot = matplotlib.pyplot.plot(ave_inflammation)
LIVE DEMO
.mean() looks artificial
max_plot = matplotlib.pyplot.plot(numpy.max(data, axis=0))
min_plot = matplotlib.pyplot.plot(numpy.min(data, axis=0))
LIVE DEMO
Can you create a plot showing the standard deviation (numpy.std()) of the inflammation data for each day across all patients?
fig = matplotlib.pyplot.figure()
ax = fig.add_subplot()
ax.set_ylabel()
ax.plot()
LIVE DEMO
Can you modify the last plot to display the three graphs on top of one another, instead of side by side?
for loops
word = "lead"
print(word[0])
print(word[1])
print(word[2])
print(word[3])
LIVE DEMO
for loops
for loops perform actions for every item in a collection
word = "lead"
for char in word:
    print(char)
LIVE DEMO
for loops
for element in collection:
    <do things with element>
for loop statement ends in a colon, :
tab (\t)
for loop cycles
length = 0
for vowel in 'aeiou':
    length = length + 1
print('There are', length, 'vowels')
LIVE DEMO
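The counting loop can be traced cycle by cycle. This sketch adds one extra print() inside the loop body (an addition for illustration) to show which lines run on every cycle and which run once after the loop has finished.

```python
length = 0
for vowel in 'aeiou':
    length = length + 1
    # Indented: part of the loop body, so it runs on every cycle.
    print('after', vowel, 'length is', length)

# Not indented: runs once, after the loop has finished.
print('There are', length, 'vowels')
```

The final count matches what the built-in len('aeiou') reports.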
letter = 'z'
for letter in 'abc':
    print(letter)
print('after the loop, letter is', letter)
LIVE DEMO
range()
range() function creates a sequence of numbers
range type that can be iterated over.
range(3)
range(2, 5)
range(3, 10, 3)
for val in range(3, 10, 3):
    print(val)
LIVE DEMO
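Because a range is not displayed as a list of numbers, one way to inspect the values it will produce is to convert it with list(). A short sketch using the same three calls; the variable names are illustrative.

```python
# list() expands a range so its values can be inspected.
up_to_three = list(range(3))         # start defaults to 0
two_to_five = list(range(2, 5))      # the stop value is excluded
by_threes = list(range(3, 10, 3))    # the third argument is the step

print(up_to_three)
print(two_to_five)
print(by_threes)
```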
Python built-in: print(5 ** 3)
Can you use a for loop to calculate 5 ** 3 using only multiplication?
Newton, and produces a new string with the characters in reverse order, e.g. notweN?
enumerate()
enumerate() function creates paired indices and values for elements of a sequence
enumerate("aeiou")
for idx, val in enumerate("aeiou"):
    print(idx, val)
\[y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4\]
coeffs = [2, 4, 3, 2, 1]
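One possible way to evaluate the polynomial with enumerate() is sketched below. It assumes that coeffs holds a_0 to a_4 in order, so that each index equals the power of x for its coefficient. The value x = 5 is an arbitrary choice for illustration.

```python
coeffs = [2, 4, 3, 2, 1]   # assumed order: a_0, a_1, a_2, a_3, a_4
x = 5                      # hypothetical input value

y = 0
for power, coeff in enumerate(coeffs):
    # The index from enumerate() doubles as the exponent of x.
    y = y + coeff * x ** power

print('y =', y)
```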
lists are a built-in Python datatype
odds = [1, 3, 5, 7]
print('odds are:', odds)
print('first and last:', odds[0], odds[-1])
for number in odds:
    print(number)
LIVE DEMO
lists, like strings, are sequences
list elements can be changed: lists are mutable
strings are not mutable
names = ['Newton', 'Darwing', 'Turing'] # typo in Darwin's name
print('names is originally:', names)
names[1] = 'Darwin' # correct the name
print('final value of names:', names)
name = 'Darwin'
name[0] = 'd' # raises TypeError: strings are not mutable
lists in-place
my_list = [1, 2, 3, 4]
your_list = my_list
my_list[1] = 0
print("my list:", my_list)
print("your list:", your_list)
LIVE DEMO
your_list?
list copies
list by slicing it or using the list() function
new_list = old_list[:]
my_list = [1, 2, 3, 4]
your_list = my_list[:] # or list(my_list)
print("my list:", my_list)
print("your list:", your_list)
my_list[1] = 0
print("my list:", my_list)
print("your list:", your_list)
LIVE DEMO
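The difference between a second name for a list and a copy of it can be checked directly with the is operator. A minimal sketch; the names alias and duplicate are illustrative.

```python
my_list = [1, 2, 3, 4]
alias = my_list          # a second name for the same list object
duplicate = my_list[:]   # a new list holding the same elements

my_list[1] = 0           # change the original in place

print(alias is my_list, alias)           # same object: the change is visible
print(duplicate is my_list, duplicate)   # separate object: unchanged
```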
lists
lists can contain any datatype, even other lists
x = [['pepper', 'zucchini', 'onion'],
     ['cabbage', 'lettuce', 'garlic'],
     ['apple', 'pear', 'banana']]
LIVE DEMO
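Elements of a list of lists are reached by indexing more than once. A short sketch using the same x; the variable names on the left are illustrative.

```python
x = [['pepper', 'zucchini', 'onion'],
     ['cabbage', 'lettuce', 'garlic'],
     ['apple', 'pear', 'banana']]

first_row = x[0]        # indexing once gives an inner list
first_item = x[0][0]    # indexing twice gives an element of that inner list
last_item = x[-1][-1]   # negative indices count from the end at both levels

print(first_row)
print(first_item, last_item)
```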
list functions
lists are Python objects and have useful functions
odds.append(9)
print("odds after adding a value:", odds)
odds.reverse()
print("odds after reversing:", odds)
print(odds.pop())
print("odds after popping:", odds)
LIVE DEMO
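A self-contained sketch of the same three list functions, starting from a fresh odds list. It also captures what each call returns, to show that append() changes the list in place and returns None, while pop() returns the element it removes. The names returned and last are illustrative.

```python
odds = [1, 3, 5, 7]

returned = odds.append(9)   # changes odds in place; there is nothing to return
odds.reverse()              # also in place: the order of odds itself changes
last = odds.pop()           # removes the final element and returns it

print('append() returned:', returned)
print('pop() returned:', last)
print('odds is now:', odds)
```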
+) having more than one meaning, depending on the thing it operates on.
vowels = ['a', 'e', 'i', 'o', 'u']
vowels_welsh = ['a', 'e', 'i', 'o', 'u', 'w', 'y']
print(vowels + vowels_welsh)
counts = [2, 4, 6, 8, 10]
repeats = counts * 2
print(repeats)
+) and ‘multiplication’ (*) do for lists?
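As a sketch of how the same operator behaves with different built-in types, the example below applies + and * to numbers, strings and lists. The variable names are illustrative.

```python
number_sum = 2 + 3                # numbers: arithmetic addition
text_join = 'ab' + 'cd'           # strings: concatenation
list_join = [1, 2] + [3, 4]       # lists: concatenation into a new list
list_repeat = [2, 4] * 2          # lists: repetition, not elementwise doubling

print(number_sum, text_join, list_join, list_repeat)
```

Unlike numpy arrays, where * is elementwise, a list multiplied by an integer is repeated that many times.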